ABSTRACT


A testing strategy which involves integrating a previously validated
module into a software system is described. It is shown that, when doing the
integration testing, it is not enough to treat the previously validated module
as a "black box", for otherwise certain integration errors may go undetected.
For example, an error in the calling program may cause an error in the module's
input which will only result in an error in the module's output along certain
paths through the module. The results indicate that such errors can be detected
by retesting, within the module, a set of paths whose cardinality depends only
on the dimensionality of the module's input space, rather than on the module's
path complexity.
AN  APPROACH  TO  RELIABLE  INTEGRATION  TESTING


Allen  Haley 
Stuart  Zweben 


1.  Introduction 

While program testing remains the most widely used method of validating
computer software, the computer field is still in the unenviable position of
lacking practical, effective testing methodologies. Test data chosen by random
or ad hoc methods may provide at least some method of testing but indicate
little as to how well tested the resulting software is. Practical testing
strategies which attempt to satisfy certain necessary conditions (such as
statement or decision coverage approaches) can be shown incapable of detecting
many errors (see, e.g. [Howden, 76, Westley, 79]). Finally, strategies which
attempt to guarantee detection of wide classes of errors (e.g., path oriented
strategies) require too much testing to be of practical value, especially where
large programs are involved. In such cases one must look for ways to reduce the
amount of testing required without placing severe penalties on the amount of
confidence in the correctness of the tested program.

One  possible  approach  to  achieving  this  reduction  is  motivated  by 
considering  the  problem  of  program  development.  In  developing  the  solution  to  a 
large,  complex  problem,  it  is  customary  to  form  subdivisions  which  abstract 
interesting  aspects  of  the  total  solution.  These  subdivisions  might  then  be 
refined,  implemented,  and  tested  as  independent  units  of  the  total  system  and 
then  integrated  to  form  a  complete  working  solution  to  the  original  problem. 

When viewing the integrated program as the object to be tested, it may well be
the case that the complexities are too great to make certain testing strategies
practical. For example, consider a program P consisting of subprogram P1
containing m paths followed by subprogram P2 containing n paths. The
integrated program can have a total of m * n paths (see Figure 1 for m=3 and
n=4), since any of the m paths in P1 can be followed by any of the n paths
in P2.


Figure  1 

Integration  of  Subprograms  with  3  and  4  paths,  respectively 


In the course of developing P however, it may well be the case that both P1
and P2 have been tested separately. It would be desirable if the correctness
information obtained in unit testing P1 and P2 could be used in validating
P. If the individual modules (see footnote 1) do not contain a large number of
paths, it may in fact be possible to test all possible paths in each module. If
the additional testing required at integration time was negligible compared to
the unit testing overhead (for example, if we could ignore the internal control
structure of a tested module when integrating it), the result would be a
reduction of the magnitude of the testing problem from O(m*n) to O(m+n).
While this represents in some sense an ideal situation, it is clear that with
such a potential for complexity reduction, even a less than ideal solution might
represent a considerable improvement and yet provide a substantial degree of
practicality.

Research to date in program testing has concentrated on a single program
unit. There is nothing inherently wrong or undesirable with this approach,
since in order to be able to say anything about a large problem, it makes sense
to gain an understanding of the (smaller) units of which the problem is
composed. However, given that these individual units must eventually work
together as a system, we must be conscious of potential problems that might be
encountered at integration time, and develop testing strategies which are
sensitive to these problems as well as those of a single program unit.


1. The notion of a module has been characterized in many different ways, and
several authors have proposed criteria for what constitutes a "good module"
[see, e.g. Parnas, 72, Yourdon, 79, Stevens, 74]. It will suit our purposes to
allow a module to be any single entry, single exit block of code which can
contain an arbitrary amount of internal control structure. For simplicity, we
may represent modules as subroutines in this paper, with the understanding that
the ideas presented herein are not meant to be restricted to this form of
modular structure.

Thus,  the  justification  for  the  development  of  a  method  of  integrating 
independently  tested  modules  into  a  single  program  is  (1)  to  reduce  the  total 
testing  complexity,  and  (2)  to  make  the  testing  procedure  conform  to  the  way 
programs  are  developed. 

2.  Integration  Testing  Philosophies

There  are  basically  two  approaches  to  testing  a  set  of  modules  which 
form  the  overall  structure  of  a  system  —  top  down  and  bottom  up. 


System Configuration: [diagram not recovered]

Possible Test Sequence:

   1  {E}
   2  {F}
   3  {G}
   4  {B,E}
   5  {C,F,G}
   6  {D}
   7  {A,B,C,D,E,F,G}


Figure 2


In  bottom  up  testing,  individual  modules  are  first  tested  in  isolation 
from  one  another  using  their  own  sets  of  test  data.  When  groups  of  related 
modules  have  been  validated,  they  are  integrated  into  a  higher  level  unit 
(subsystem)  which  is  then  tested.  The  subsystem  tests  in  general  require  new 
test data, since different inputs and outputs are involved at the higher level.
Larger  and  larger  subsystems  are  combined  until  eventually  the  entire  system  is 
tested  as  a  unit.  Using  the  "system"  of  Figure  2,  a  possible  sequence  of  tests 
using  a  bottom  up  technique  would  be:  (1-7).  The  primary  disadvantages  of  this 
form  of  testing  are  the  variety  of  test  data  required  at  each  level  and  the 
increasing  complexity  of  the  subsystems  as  the  integration  proceeds. 

Top  down  testing,  on  the  other  hand,  involves  starting  with  the  highest 
level component of the system and proceeding to the next lower level, etc.
Using Figure 2 as a model once again, the "higher level" test should ideally
determine  that  A  functions  correctly,  knowing  only  the  abstract  purposes  of  B,  C 
and  D,  so  that  when  B,  C  and  D  are  eventually  implemented,  it  is  necessary  only 
to  show  that  they  achieve  their  abstract  purpose  (that  is,  theoretically  no 
integration  testing  is  required).  However,  our  ability  to  select  appropriate 
tests  of  A  based  on  these  abstractions  and  to  certify  the  correctness  of  A's 
output  on  these  tests  (which,  after  all,  requires  output  from  abstract,  as  yet 
unwritten  modules),  is  quite  limited.  We  are  much  more  likely  when  testing  A  to 
first  write  stubs  for  B,  C  and  D  which  do  nothing  more  than  indicate  that  the 
second  level  modules  have  in  fact  been  called,  and  later  embellish  B,  C  and  D  so 
that  they  can  produce  correct  output  for  some  very  small,  well  known  class  of 
possible  inputs.  While  this  method  helps  identify  the  appropriateness  of  the 
invocation  of  the  lower  level  modules  and  can  speed  up  the  completion  of  a 
preliminary  version  of  a  complex  system,  it  tends  to  mask  the  subtle 
interrelationships  between  the  components  until  all  are  completely  developed  and 
an  attempt  is  made  to  have  them  work  as  a  unit. 

Therefore, one can say that even if a top down testing philosophy is
attempted, integration testing will be necessary after the lowest level modules
have been completed. That is, some mixture of top down and bottom up testing is
probable.


In  the  remainder  of  the  paper,  we  will  explore  the  issues  involved  when 
a  "correct"  module  (one  which  produces  the  appropriate  output  for  any  valid 
input)  is  integrated  into  a  larger  program  context,  with  the  goal  of  identifying 
testing  strategies  which  are  sensitive  to  integration  time  errors. 


3.  Integration  Time  Errors

In  order  to  be  able  to  characterize  the  effectiveness  of  any  testing 
approach,  it  is  necessary  to  identify  those  errors  which  are  of  interest  to  the 
strategy  under  consideration.  Any  finite  testing  procedure  is  known  to  be  faced 
with  certain  inherent  limitations  as  to  the  errors  it  is  capable  of  detecting. 
One of these limitations can be characterized as the "coincidental correctness"
problem, whereby the program under examination happens to produce the same
results as the (different) desired program on the set of data tested. Thus, a
statement such as X=X+2 cannot be differentiated from X=X*2 if the only test
data chosen result in X=2 on entry to the statement. Another inherent
limitation of any finite strategy has been called the "missing path" problem.
This problem arises, for example, when some "special case" has not been
appropriately dealt with. Thus there may be some special action to be taken
only "IF X=1", which the programmer forgot to include. If none of the test data
happen to set this condition, the missing action will not be detected.

Admitting that errors due to coincidental correctness and missing paths
may go undetected, the next problem is to try to classify those kinds of errors
that we might hope to detect. One proposal, due to Howden [Howden, 76],
distinguishes between domain errors and computation errors. A domain error
occurs when a specific input follows the wrong path due to an error in the
control flow of the program. A computation error exists when a specific input
follows the correct path, but an error in some assignment statement causes the
wrong function to be computed for one or more of the output variables. This
classification scheme has been used successfully by researchers of the Domain
Testing Strategy [White, 80]. The Domain Testing Strategy is designed to detect
domain errors, though it also has some ability to detect computation errors.

The  notion  of  domain  and  computation  errors  turns  out  to  be  useful  in 
characterizing  certain  types  of  integration  problems.  For  example,  consider  a 
module  M  which  has  been  thoroughly  validated,  say  by  some  "Hypothetical 
Testing  Strategy",  so  that  it  is  free  of  both  domain  and  computation  errors. 
Module M is to be integrated into a program P. Assume that P has some
computation whose result (call it C) is used in some predicate of M but is
not used anywhere else in the program (see Figure 3).


        P                              M

   READ Ip                        IF C < 4
   C = Ip                            THEN Om = 1
   CALL M(C,...,Om)                  ELSE Om = 2
   Op = Om
   PRINT Op


Figure 3

Program Containing a Computation Used Only
in a Predicate of a Previously Tested Module
Now suppose that the correct computation in P should have set C to Ip+1.
In validating M, we may have ensured that M produces the correct output no
matter which branch of the IF statement is taken, but P will still produce the
wrong output if the initial value Ip is such that 3 <= Ip < 4. However, if we do
not happen to choose a value of Ip in this range we will not catch the error
in the computation statement. Notice that, from the point of view of the
program P, there is only one path to consider (READ Ip; C=Ip; CALL M(...);
Op=Om; PRINT Op) if we ignore the control structure of the module M. Ideally,
we would like to be able to ignore the internal structure of M at integration
time and deal only with P's structure. Yet this example shows that we must do
more than just select a couple of values of Ip and examine the resulting
values of Op. In this case, if we were to analyze the integrated program
including the module's control structure, we would notice that the program
contains a domain error, since values of Ip in the range 3 <= Ip < 4 follow the
wrong path.
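
The domain error of Figure 3 can be reproduced with a small sketch. The Python below is only an illustration under stated assumptions: the function names (module_m, program_p, reveals_error) and the sample inputs are ours, and the "correct" program is taken to be the one that sets C to Ip+1 as supposed above.

```python
def module_m(c):
    """Validated module M of Figure 3: Om = 1 if C < 4, otherwise Om = 2."""
    return 1 if c < 4 else 2


def program_p(ip, correct=False):
    """Calling program P. The faulty version sets C = Ip instead of Ip + 1."""
    c = ip + 1 if correct else ip
    om = module_m(c)
    return om  # Op = Om


def reveals_error(ip):
    """True when the faulty and the intended program print different outputs."""
    return program_p(ip) != program_p(ip, correct=True)


# Hypothetical integration test inputs; only those in 3 <= Ip < 4 expose the fault.
samples = [0, 1, 2, 3, 3.5, 4, 5, 10]
revealing = [ip for ip in samples if reveals_error(ip)]
print(revealing)
```

Every sample outside the range 3 <= Ip < 4 produces the same output in both versions, so a test set that misses this range leaves the error in P undetected.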


Computation  errors  cause  another  problem  in  ignoring  the  validated 
module's  control  structure  at  integration  time.  Assume  that  the  program  contains 
an  incorrect  computation  whose  result  is  passed  to  the  validated  module. 
Further  assume  that  the  only  use  of  this  result  is  by  some  computation  in  the 
validated  module.  As  an  example,  suppose  P  is  the  same  as  in  Figure  3,  but  M 
is  changed  as  in  Figure  4. 


        M

   IF (condition)
      THEN Om = C
      ELSE Om = 2


Figure 4

Module Which Transmits a Program Computation Error


Assume  once  again  that  the  computation  in  P  should  set  C  equal  to 
Ip+1  instead  of  Ip  .  If  integration  test  data  were  chosen  which  never 
exercised  the  true  branch  of  the  condition  in  M  ,  then  the  resulting  value  of 
Om  would  always  be  2  and  the  error  in  the  computation  of  P  would  go 
undetected  by  simply  examining  the  output  of  the  program. 
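
A similar sketch illustrates the Figure 4 situation. Since the figure leaves the predicate as "(condition)", the code below simply passes the branch outcome in as a parameter; that parameter and all names are assumptions made for illustration only.

```python
def module_m(c, condition):
    """Module M of Figure 4: Om = C on the true branch, Om = 2 otherwise."""
    return c if condition else 2


def program_p(ip, condition, correct=False):
    """Calling program P. The faulty version sets C = Ip instead of Ip + 1."""
    c = ip + 1 if correct else ip
    return module_m(c, condition)  # Op = Om


# Pairs of (faulty output, intended output) for a few hypothetical inputs.
false_branch = [(program_p(ip, False), program_p(ip, False, correct=True))
                for ip in range(5)]
true_branch = [(program_p(ip, True), program_p(ip, True, correct=True))
               for ip in range(5)]

print(false_branch)  # outputs agree: the error in C is masked
print(true_branch)   # outputs differ: the error in C is transmitted
```

Tests that only exercise the false branch always observe Om = 2, whereas any test through the true branch transmits the erroneous value of C to the output.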

These  two  examples  have  elements  in  common.  In  both  cases  there  is  an 
error  in  the  code  preceding  the  call  to  the  validated  module.  The  error  causes 
one  of  the  module's  inputs  to  have  an  incorrect  (not  invalid)  value;  it  is 
possible  for  the  error  in  the  module's  input  to  not  be  reflected  as  an  error  in 
the  module's  output,  since  transmission  of  the  error  to  an  output  may  be 
dependent  upon  the  particular  path  chosen  through  the  module.  It  is  therefore 
clear  that,  when  integrating  a  previously  validated  module,  one  needs  to  know 
more  than  just  that  the  module  is  correct.  If  information  relevant  to  the 
module's  internal  structure  is  ignored,  it  is  possible  for  both  domain  and 
computation  errors  in  the  integrated  program  to  go  undetected.  Therefore  it  is 
natural  to  ask  at  this  stage  "What,  in  addition  to  knowing  that  the  module  is 
correct,  will  allow  effective  integration  testing  to  be  done?". 

4.  Detecting  Integration  Errors

Two  approaches  to  answering  the  question  posed  at  the  end  of  the 
previous  section  are  suggested  by  the  examples  presented  in  that  section.  Since 
our  goal  is  to  detect  errors  in  the  module's  input,  we  could  simply  require  that 
all input values to the module be output (along with the normal output of the
calling program). This technique is not new, as programmers often print out
values of intermediate/temporary variables. However it is often hard to know
whether an intermediate program value is correct. More likely, the programmer
is  only  interested  in  examining  the  final  outputs  of  the  (calling)  program. 



Therefore,  we  consider  a  second  approach.  It  would  appear  that  the 
chief  problem  presented  in  the  previous  section  is  that  the  module's  output  may 
be  unaffected  by  the  error  in  the  calling  program.  This  section  therefore 
addresses  the  problem  of  determining  how  much  integration  testing  need  be  done 
in  the  module  in  order  to  ensure  that  an  error  in  the  module's  input  results  in 
an  error  in  the  module's  output. 

To  examine  this  problem  in  greater  detail  we  will  first  impose  the 
following  restrictions  on  the  module. 

1.  restrict  the  module  such  that  all  inputs  are  assigned  upon  entry,  and 
no  inputs  are  reassigned  later  in  the  module. 

2.  restrict  the  output  variables  such  that  all  output  variables  are 
assigned  at  the  end  of  the  module,  i.e.  on  a  given  path  after  the  first 
assignment  to  an  output  variable  no  more  assignments  may  be  made  to  program 
variables.  In  addition,  no  output  variable  of  the  module  can  be  used  as  a 
reference  within  the  module.  (This  implies  that  the  order  in  which  the 
output  variables  are  assigned  along  a  path  does  not  matter.) 

3.  restrict  all  computations  and  predicates  to  be  linear  with  respect  to 
the  module's  inputs. 


The  first  two  restrictions  serve  primarily  to  simplify  the  notation 
which follows. Since any program can be written so that it conforms to these
two restrictions, they are not fundamental limitations. The third restriction,
though  considerable,  makes  it  easier  to  model  the  computation  sequence  along  a 
path  in  the  program.  After  the  major  results  have  been  derived,  we  will  discuss 
the  relaxation  of  the  third  restriction. 
Given these restrictions, a program can be modeled in the following
manner.

Suppose the program contains m input variables I1,...,Im,
n program variables P1,...,Pn, and ℓ output variables O1,...,Oℓ.
We introduce an environment vector, V, which contains the current value of
all variables at some point of execution.

        |      1      |
        | value of I1 |
        |      .      |
        | value of Im |
        | value of P1 |
   V =  |      .      |
        | value of Pn |
        | value of O1 |
        |      .      |
        | value of Oℓ |

The 1 represents a position for constants. Its need will become apparent
in what follows, as we describe computations and predicates of the module
in terms of an environment.
Example 1  m=4, n=3, ℓ=2


   SUBROUTINE MODULE1(I1,I2,I3,I4,O1,O2)

    1.  P1=I1+I2
    2.  P2=2*I1-I3+I4
    3.  IF P1=0
    4.  THEN
    5.     P3=P2+3
    6.  ELSE
    7.     P3=P1+P2
    8.  ENDIF
    9.  O1=P3
   10.  O2=P1+P2+P3
   11.  RETURN
   12.  END

If (I1, I2, I3, I4) = (1, -1, 2, 3), then after executing statement 1
(writing the column vector V as a transposed row),

   V = (1, 1, -1, 2, 3, 0, undef, undef, undef, undef)^T

At the end of the program,

   V = (1, 1, -1, 2, 3, 0, 3, 6, 6, 9)^T
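
To make the environment vector concrete, the following sketch transcribes MODULE1 into Python and records V after each assignment. It is an illustration only: the helper names (module1_trace, assign, ORDER) are ours, and None is used as a stand-in for "undef".

```python
UNDEF = None  # marks a variable that has not been assigned yet
ORDER = ["I1", "I2", "I3", "I4", "P1", "P2", "P3", "O1", "O2"]


def module1_trace(i1, i2, i3, i4):
    """Execute MODULE1 and return the environment vector after each assignment."""
    env = dict.fromkeys(ORDER, UNDEF)
    env.update(I1=i1, I2=i2, I3=i3, I4=i4)
    trace = []

    def assign(name, value):
        env[name] = value
        # The leading 1 is the constant position of the environment vector.
        trace.append([1] + [env[k] for k in ORDER])

    assign("P1", i1 + i2)                            # statement 1
    assign("P2", 2 * i1 - i3 + i4)                   # statement 2
    if env["P1"] == 0:                               # statement 3
        assign("P3", env["P2"] + 3)                  # statement 5
    else:
        assign("P3", env["P1"] + env["P2"])          # statement 7
    assign("O1", env["P3"])                          # statement 9
    assign("O2", env["P1"] + env["P2"] + env["P3"])  # statement 10
    return trace


trace = module1_trace(1, -1, 2, 3)
print(trace[0])   # V after statement 1
print(trace[-1])  # V at the end of the program
```

For the inputs of Example 1 the first and last entries of the trace agree with the two vectors given above.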
A computation in the program (which by assumption is linear) will be
represented as a (1+m+n+ℓ) by (1+m+n+ℓ) matrix. Intuitively, a row
of this matrix describes the effect of the computation on an individual input,
program, or output variable.

For  a  single  assignment  statement,  which  assigns  exactly  one  program  or 
output  variable,  the  matrix  is  just  an  identity  matrix  except  in  the  row 
corresponding  to  the  assigned  variable.  The  entries  in  this  row  contain  the 
coefficients  of  the  input  and  program  variables  which  appear  on  the  right  hand 
side  of  the  assignment  statement,  placed  in  the  appropriate  columns. 


Example 2  Using the subroutine MODULE1 in Example 1, the matrix C
corresponding to statement 2 is

                 | 1  0  0  0  0  0  0  0  0  0 |
                 | 0  1  0  0  0  0  0  0  0  0 |
                 | 0  0  1  0  0  0  0  0  0  0 |
                 | 0  0  0  1  0  0  0  0  0  0 |
   C(stmt 2)  =  | 0  0  0  0  1  0  0  0  0  0 |
                 | 0  0  0  0  0  1  0  0  0  0 |
                 | 0  2  0 -1  1  0  0  0  0  0 |
                 | 0  0  0  0  0  0  0  1  0  0 |
                 | 0  0  0  0  0  0  0  0  1  0 |
                 | 0  0  0  0  0  0  0  0  0  1 |

while that for statement 5 is

                 | 1  0  0  0  0  0  0  0  0  0 |
                 | 0  1  0  0  0  0  0  0  0  0 |
                 | 0  0  1  0  0  0  0  0  0  0 |
                 | 0  0  0  1  0  0  0  0  0  0 |
   C(stmt 5)  =  | 0  0  0  0  1  0  0  0  0  0 |
                 | 0  0  0  0  0  1  0  0  0  0 |
                 | 0  0  0  0  0  0  1  0  0  0 |
                 | 3  0  0  0  0  0  1  0  0  0 |
                 | 0  0  0  0  0  0  0  0  1  0 |
                 | 0  0  0  0  0  0  0  0  0  1 |


A  sequence  of  computations  along  a  given  path  can  be  represented  as  the 
product  of  the  matrices  corresponding  to  the  individual  assignment  statements. 


Example 3  The computation sequence corresponding to executing statements 1, 2
and 5 is

                                             | 1  0  0  0  0  0  0  0  0  0 |
                                             | 0  1  0  0  0  0  0  0  0  0 |
                                             | 0  0  1  0  0  0  0  0  0  0 |
                                             | 0  0  0  1  0  0  0  0  0  0 |
   C(stmt 5) x (C(stmt 2) x C(stmt 1))  =    | 0  0  0  0  1  0  0  0  0  0 |
                                             | 0  1  1  0  0  0  0  0  0  0 |
                                             | 0  2  0 -1  1  0  0  0  0  0 |
                                             | 3  2  0 -1  1  0  0  0  0  0 |
                                             | 0  0  0  0  0  0  0  0  1  0 |
                                             | 0  0  0  0  0  0  0  0  0  1 |
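
The construction of the statement matrices and their product along a path can be sketched as follows. This is a minimal illustration in plain Python; the helper names (assignment_matrix, matmul, NAMES) are our own, and the constant position is given the name "1".

```python
# Order of the positions in the environment vector for MODULE1.
NAMES = ["1", "I1", "I2", "I3", "I4", "P1", "P2", "P3", "O1", "O2"]


def assignment_matrix(target, terms):
    """Identity matrix except for the row of `target`, which holds the
    coefficients of the right hand side of the assignment."""
    size = len(NAMES)
    mat = [[int(r == c) for c in range(size)] for r in range(size)]
    row = [0] * size
    for name, coeff in terms.items():
        row[NAMES.index(name)] = coeff
    mat[NAMES.index(target)] = row
    return mat


def matmul(a, b):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


c1 = assignment_matrix("P1", {"I1": 1, "I2": 1})             # P1=I1+I2
c2 = assignment_matrix("P2", {"I1": 2, "I3": -1, "I4": 1})   # P2=2*I1-I3+I4
c5 = assignment_matrix("P3", {"P2": 1, "1": 3})              # P3=P2+3

# Later statements multiply on the left, as in Example 3.
path = matmul(c5, matmul(c2, c1))
for row in path:
    print(row)
```

The rows for P1, P2 and P3 of the product hold the symbolic values of these variables in terms of the inputs and the constant, while the rows for O1 and O2 are still identity rows because no output has been assigned yet.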


Given  this  model,  there  are  two  separate  ways  in  which  an  error  in  the 
input  can  be  transmitted  to  an  output  of  the  module. 


1.  An  error  in  the  input  can  cause  the  correct  path  (i.e.  the  same  path 
that  would  have  been  followed  had  there  been  no  error)  to  be  taken,  but  can 
result  in  an  incorrect  value  being  assigned  to  an  output  variable.  With 
respect  to  the  entire  integrated  program,  such  a  situation  can  be  viewed  as 
a  computation  error.  We  will  call  such  errors  "integration  time 
computation  errors". 


2.  An  error  in  an  input  can  cause  an  incorrect  path  to  be  taken  in  the 
module  which  in  turn  results  in  a  different  computation  being  performed  and 
hence  an  incorrect  value  in  an  output  variable.  With  respect  to  the 
integrated  program,  this  situation  can  be  viewed  as  a  domain  error.  We 
will  call  such  errors  "integration  time  domain  errors". 
5.  Detection  of  Integration  Time  Computation  Errors

We first examine the methods by which an input error to the module can
be transmitted to an output variable along a given path (the computation error
situation). Using the three restrictions introduced earlier, the results of the
computations along a given path can be represented by

   VF = C(Oℓ)...C(O1) C(k)...C(1) V0

where C(k),...,C(1) represent the computations which assign program variables
along the path, C(Oℓ),...,C(O1) represent the assignments to the output
variables, V0 is the initial environment vector and VF is the final
environment vector (the result of the computations). The above expression can
be condensed by matrix multiplication to

   VF = C(O) C(P) V0

where C(P) = C(k)...C(1) and C(O) = C(Oℓ)...C(O1). This can be further reduced to

   VF = C V0

where C is now a single (1+m+n+ℓ) by (1+m+n+ℓ) matrix which
represents the results of all computations performed along the particular path.
The final expression for each program and output variable of C is in terms of
the input variables and constant only, thus corresponding to performing a
symbolic execution of the particular path.


2. We will adopt the convention that it is permissible to multiply undefined
values by zero, resulting in a value of zero. Multiplication of undefined
values by any nonzero quantity will result in a value which is undefined (see
Example 4).


Example 4  Using the same subroutine as in Example 1, with
(I1, I2, I3, I4) = (1, -1, 2, 3), we have

   V0 = (1, 1, -1, 2, 3, undef, undef, undef, undef, undef)^T

         | 1  0  0  0  0  0  0  0  0  0 |
         | 0  1  0  0  0  0  0  0  0  0 |
         | 0  0  1  0  0  0  0  0  0  0 |
         | 0  0  0  1  0  0  0  0  0  0 |
   C  =  | 0  0  0  0  1  0  0  0  0  0 |
         | 0  1  1  0  0  0  0  0  0  0 |
         | 0  2  0 -1  1  0  0  0  0  0 |
         | 3  2  0 -1  1  0  0  0  0  0 |
         | 3  2  0 -1  1  0  0  0  0  0 |
         | 3  5  1 -2  2  0  0  0  0  0 |

   VF = C V0 = (1, 1, -1, 2, 3, 0, 3, 6, 6, 9)^T
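
The product VF = C V0 of Example 4, including the convention for undefined values, can be checked with a short sketch. The code is illustrative only: None stands for "undef", and the helper names (times, apply_path) are assumptions of ours.

```python
UNDEF = None  # stand-in for an undefined value

# Path matrix C for statements 1, 2, 5, 9 and 10 of MODULE1.
C = [
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 2, 0, -1, 1, 0, 0, 0, 0, 0],
    [3, 2, 0, -1, 1, 0, 0, 0, 0, 0],
    [3, 2, 0, -1, 1, 0, 0, 0, 0, 0],
    [3, 5, 1, -2, 2, 0, 0, 0, 0, 0],
]


def times(coeff, value):
    """Zero times undef is zero; a nonzero coefficient times undef is undef."""
    if coeff == 0:
        return 0
    return UNDEF if value is UNDEF else coeff * value


def apply_path(matrix, vector):
    """Multiply a path matrix by an environment vector under the convention."""
    result = []
    for row in matrix:
        terms = [times(c, v) for c, v in zip(row, vector)]
        result.append(UNDEF if any(t is UNDEF for t in terms) else sum(terms))
    return result


V0 = [1, 1, -1, 2, 3, UNDEF, UNDEF, UNDEF, UNDEF, UNDEF]
VF = apply_path(C, V0)
print(VF)
```

Because every column of C belonging to a program or output variable is zero, the undefined entries of V0 never reach VF.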


The question to be considered is "In what ways can an error in V0 be
transformed into an error in VF?" If V0 is incorrect then it can be
represented as

   V0 = V0' + e,     e ≠ 0

where V0' is the correct initial environment vector, and e is the initial error
term in the environment vector. Since the initial error can only occur in the input
variables, e is restricted to be 0 in the first position, and 0 in the last
n + ℓ positions, and nonzero in at least one position from 2 to m + 1.


If this new expression is substituted into the expression for the path
environment it yields

   VF = C (V0' + e)

or

   VF = C V0' + C e

Since the erroneous input follows the same path through the module,
C V0' represents the "correct" final environment vector (i.e. the same C
should have been applied to V0', the correct initial environment). Therefore
the error is only detectable in the final environment vector if C e ≠ 0.
However, this restriction isn't sufficient to ensure detection. This is because
we are assuming that, as a result of executing a path through the module, only
the "output variable part" of the final environment vector, and not the entire
final environment vector, is available. Therefore, if the error in the input is
to be detected, then (see footnote 3)

   (C e) ≠ 0  in elements m+n+2 through m+n+ℓ+1          (1)


3. As a notational convention, we will use subscripts to express a subset of
elements of a vector or matrix. Thus, condition (1) can be written as

   (C e)_{m+n+2,...,m+n+ℓ+1} ≠ 0

Similarly, if we wished to describe the "upper left" submatrix of C containing
the first y rows and z columns, we could write C_{1,...,y x 1,...,z}.


(Clearly, C e is never equal to 0 since e ≠ 0 and we have
restricted the module so that it cannot reassign its input variables. Hence at
least one of the elements in rows 2 through m + 1 must be nonzero.)
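
Condition (1) can be illustrated on the path of MODULE1 through statements 1, 2, 5, 9 and 10. The sketch below is ours, not part of the original development: the helper names and the two error vectors are chosen purely for illustration.

```python
M, N, L = 4, 3, 2  # numbers of input, program and output variables of MODULE1

# Path matrix C for statements 1, 2, 5, 9 and 10 of MODULE1.
C = [
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 2, 0, -1, 1, 0, 0, 0, 0, 0],
    [3, 2, 0, -1, 1, 0, 0, 0, 0, 0],
    [3, 2, 0, -1, 1, 0, 0, 0, 0, 0],
    [3, 5, 1, -2, 2, 0, 0, 0, 0, 0],
]


def input_error(d1, d2, d3, d4):
    """Error vector e: zero in the constant position and in the last n + l positions."""
    return [0, d1, d2, d3, d4] + [0] * (N + L)


def propagate(c, e):
    """The term C e added to the correct final environment vector."""
    return [sum(a * b for a, b in zip(row, e)) for row in c]


def detectable(c, e):
    """Condition (1): C e is nonzero somewhere in the output positions."""
    return any(x != 0 for x in propagate(c, e)[1 + M + N:])


e_seen = input_error(1, 0, 0, 0)    # error in I1 only
e_masked = input_error(0, 0, 1, 1)  # equal errors in I3 and I4

print(propagate(C, e_seen), detectable(C, e_seen))
print(propagate(C, e_masked), detectable(C, e_masked))
```

The second error vector gives C e ≠ 0, yet all of the nonzero entries lie in the input positions, so nothing shows up in O1 or O2 along this path.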

To better understand the meaning of this restriction that C e not be
0 in the last ℓ positions, let's examine the C matrix in greater detail.


C can be considered to have 9 submatrices of the following form.

         | C(1,1)  C(1,2)  C(1,3) |
   C  =  | C(2,1)  C(2,2)  C(2,3) |
         | C(3,1)  C(3,2)  C(3,3) |

where:

C(1,1) is an m+1 by m+1 matrix which describes how the inputs and
constants are mapped onto the inputs and constants. (By our restrictions
this must be equal to the identity matrix.)

C(1,2) is an m+1 by n matrix which describes how program variables are
mapped onto the inputs and constants. (This must be 0 by our
restrictions.)

C(1,3) is an m+1 by ℓ matrix which describes how outputs are mapped
onto inputs and constants (also equal to 0).

C(2,1) is an n by m+1 matrix which describes how inputs and constants are
mapped onto program variables. (This submatrix is unrestricted in form.)

C(2,2) is an n by n matrix which describes how program variables are mapped
onto program variables. (This submatrix is 0 since C contains the results
of a symbolic execution of the path and program variables in their final
symbolic form are defined completely in terms of input variables and constants.
The only possible exception is the row corresponding to a program variable which
is not defined along the path. Such a row would have a 1 in the column
corresponding to the program variable and zeros elsewhere.)

C(2,3) is an n by ℓ matrix which describes how outputs are mapped onto program
variables (also equal to 0).

C(3,1) is an ℓ by m+1 matrix which describes how inputs and constants are
mapped onto output variables. (This submatrix is unrestricted in form.)

C(3,2) is an ℓ by n matrix which describes how program variables are mapped
onto outputs. (This submatrix must be 0 for the same reasons given for
C(2,2) above.)

C(3,3) is an ℓ by ℓ matrix which describes how outputs are mapped onto
outputs. (This must also be 0, with the exception of "identity rows", as
described in C(2,2), corresponding to output variables which are unassigned
along the path.)


Example 5  Using the C matrix from Example 4, we have the following
partition

         | 1  0  0  0  0 | 0  0  0 | 0  0 |
         | 0  1  0  0  0 | 0  0  0 | 0  0 |
         | 0  0  1  0  0 | 0  0  0 | 0  0 |
         | 0  0  0  1  0 | 0  0  0 | 0  0 |
         | 0  0  0  0  1 | 0  0  0 | 0  0 |
         |---------------+---------+------|
   C  =  | 0  1  1  0  0 | 0  0  0 | 0  0 |
         | 0  2  0 -1  1 | 0  0  0 | 0  0 |
         | 3  2  0 -1  1 | 0  0  0 | 0  0 |
         |---------------+---------+------|
         | 3  2  0 -1  1 | 0  0  0 | 0  0 |
         | 3  5  1 -2  2 | 0  0  0 | 0  0 |

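
The partition of Example 5 amounts to slicing C at rows and columns m+1 and m+1+n. A minimal sketch follows; the function name partition and the dictionary layout keyed by (i, j) are our own choices for illustration.

```python
def partition(c, m, n, l):
    """Split a (1+m+n+l) square path matrix into the nine blocks C(i,j)."""
    bounds = [0, m + 1, m + 1 + n, m + 1 + n + l]
    blocks = {}
    for i in range(3):
        for j in range(3):
            rows = c[bounds[i]:bounds[i + 1]]
            blocks[(i + 1, j + 1)] = [row[bounds[j]:bounds[j + 1]] for row in rows]
    return blocks


# Path matrix C of Example 4 (m=4, n=3, l=2).
C = [
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 2, 0, -1, 1, 0, 0, 0, 0, 0],
    [3, 2, 0, -1, 1, 0, 0, 0, 0, 0],
    [3, 2, 0, -1, 1, 0, 0, 0, 0, 0],
    [3, 5, 1, -2, 2, 0, 0, 0, 0, 0],
]

blocks = partition(C, m=4, n=3, l=2)
c31 = blocks[(3, 1)]                   # inputs and constant -> outputs
c31_prime = [row[1:] for row in c31]   # C(3,1) without the constant column
print(c31)
print(c31_prime)
```

Only C(2,1) and C(3,1) carry information for this path; the remaining blocks are the identity or zero, as described above.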

If we now return to the question of transforming an error in the input
to an error in the output by looking at the breakdown, we note that the only
interesting part of the C matrix is the submatrix C(3,1) which describes how
inputs and the constant are mapped onto the output variables. The results of
the computations along a path can now be described by

   VF_{1+m+n+1,...,1+m+n+ℓ} = C_{1+m+n+1,...,1+m+n+ℓ x 1,...,m+1} V0_{1,...,m+1}
                            = C(3,1) V0_{1,...,m+1}

Again introducing the error term we get

   VF_{1+m+n+1,...,1+m+n+ℓ} = C(3,1) (V0'_{1,...,m+1} + e_{1,...,m+1})

or

   VF_{1+m+n+1,...,1+m+n+ℓ} = C(3,1) V0'_{1,...,m+1} + C(3,1) e_{1,...,m+1}

so that the error can only be detected in the final environment vector for this
path if C(3,1) e_{1,...,m+1} ≠ 0.

We can further note that the constant position in e_{1,...,m+1} is always
zero. That is, we really have only m linearly independent "error directions",
corresponding to the m input variables of the module (assuming no inherent
relationships among these inputs). Therefore, the first column of C(3,1)
plays no part in determining whether C(3,1) e_{1,...,m+1} ≠ 0. Defining C'(3,1)
to be C(3,1) without the first column, we can now say that the input error can
only be detected in the final environment vector for the path if

   C'(3,1) e_{2,...,m+1} ≠ 0.

We therefore seek a feasible path for which it is possible to detect an
input error in any of the m directions by examining the values of the output
variables in the final environment vector. Such a path must "span the error
space", so that all possible error directions are feasible along the path.
Since the dimension of the error space and input space are the same, we will
call such a path input space spanning.

We  can  now  conclude 

Lemma 1: Examination of the outputs of a module on any test exercising an input
space spanning path for which

$$
e_{2,\ldots,m+1} \neq 0 \;\Rightarrow\; C'(3,1)\,e_{2,\ldots,m+1} \neq 0 \qquad (2)
$$

is sufficient to detect any integration time computation error affecting the
module.


From  linear  algebra,  we  know  that 

Lemma 2: A path satisfies condition (2) if there exists an m x m matrix M
consisting of m rows of C'(3,1) such that |M| ≠ 0.

Intuitively a path satisfying the conditions of Lemma 2 transforms the
inputs to outputs in such a way that there are m linearly independent
functions of the inputs computed on that path.
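The condition of Lemma 2 is equivalent to saying that C'(3,1) has rank m,
which can be checked without enumerating m x m submatrices. The sketch below
does this with exact rational arithmetic. The two example paths are
hypothetical (they do not come from the module of Example 1), and the
function names are illustrative assumptions.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def path_spans_error_space(c_prime, m):
    """Lemma 2 test: True when some m rows of C'(3,1) are independent."""
    return rank(c_prime) == m

# Hypothetical two-input paths (rows: outputs, columns: I1, I2).
good_path = [[1, 1], [1, -1]]    # O1 = I1 + I2, O2 = I1 - I2
blind_path = [[1, 1], [2, 2]]    # O2 is just a multiple of O1

print(path_spans_error_space(good_path, 2))    # True
print(path_spans_error_space(blind_path, 2))   # False
```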

Unfortunately, the results may not be very helpful since

a) there is no guarantee that such a path exists (in particular, if
ℓ < m, as in Example 1, no such path can exist), and

b) even if there is such a path, finding it, or even showing its
existence, may require a substantial amount of computation. For
example, for each path in the module, there may be $\binom{\ell}{m}$
matrices whose determinants must be checked.

Example 6 Using the program of Example 1, we note that neither path is
sensitive to all possible error directions. The "then" path, analyzed in
Example 5, gives

$$
C'(3,1) = \begin{pmatrix} 2 & 0 & -1 & 1 \\ 5 & 1 & -2 & 2 \end{pmatrix}
$$

which clearly contains no 4 x 4 matrix having nonzero determinant. It is easy
to see that this path cannot detect an error where both I3 and I4 are
incorrect by the same amount k. In such a case

    O1 = 3 + 2*I1 - (I3+k) + (I4+k) = 3 + 2*I1 - I3 + I4

    O2 = 3 + 5*I1 + I2 - 2*(I3+k) + 2*(I4+k) = 3 + 5*I1 + I2 - 2*I3 + 2*I4

which are the same results as would have been produced with correct values of
I3 and I4.
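The blind direction can also be checked numerically. In this small sketch the
rows of `C_PRIME` are the input coefficients of O1 and O2 stated above, and
`output_change` (an illustrative name) computes C'(3,1) times an error vector,
that is, the change an input error causes in the outputs.

```python
# Input coefficients (I1, I2, I3, I4) of O1 and O2 on the "then" path.
C_PRIME = [
    [2, 0, -1, 1],
    [5, 1, -2, 2],
]

def output_change(c_prime, error):
    """Change in each output caused by adding `error` to the inputs."""
    return [sum(c * e for c, e in zip(row, error)) for row in c_prime]

k = 7
print(output_change(C_PRIME, (0, 0, k, k)))   # [0, 0]: error is invisible
print(output_change(C_PRIME, (0, 0, k, 0)))   # [-7, -14]: error is visible
```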

The reader can verify that the "else" path is also insensitive to
certain error directions.

However, it is not necessary for a path satisfying Lemma 2 to exist.
Consider an m x m matrix M' constructed from various rows in C'(3,1)
matrices taken from different input space spanning paths through the module.
That is, M' consists of the results of some subset of output variables along
some set of paths through the module. If |M'| ≠ 0, then
$e_{2,\ldots,m+1} \neq 0 \Rightarrow M' e_{2,\ldots,m+1} \neq 0$.
Intuitively this means that we only need to find m linearly
independent functions of the inputs computed somewhere in the module, even if we
need to select several paths to find them. Of course, it is entirely possible
that the same output variable will need to be examined on more than one path in
order to obtain this set of functions. We can now conclude

Lemma  3:  Examination  of  the  appropriate  output  variables  from  any  test 
exercising  those  paths  corresponding  to  the  construction  of  M'  as  defined 
above  is  sufficient  to  detect  any  integration  time  computation  error  affecting 
the  module . 

Example 7

    SUBROUTINE MODULE2(I1, I2, O1, O2)
    IF I1 = 0
    THEN
        O1 = 2*I1+1
        O2 = I1+2
    ELSE
        IF I2 = 0
        THEN
            O1 = I2+1
            O2 = 2*I2+2
        ELSE
            O1 = I1+I2
            O2 = 3
        ENDIF
    ENDIF

No path is by itself sensitive to every error direction. However any two
paths are sensitive to all error directions with the stipulation that if the
"else-else" path is selected, O1 must be examined. The relevant C'(3,1)
matrices for this example are

$$
\text{then: } \begin{pmatrix} 2 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
\text{else-then: } \begin{pmatrix} 0 & 1 \\ 0 & 2 \end{pmatrix}, \qquad
\text{else-else: } \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}.
$$

The  following  theorem  is  an  immediate  consequence  of  Lemma  3. 

Theorem  1:  In  order  to  detect  any  integration  time  computation  error  affecting 
a  previously  validated  module,  it  is  sufficient  to  test  at  most  m  input  space 
spanning  paths  from  the  module,  chosen  so  as  to  guarantee  the  existence  of  M' 
defined  above. 

Proof: A path need not be chosen for inclusion in the integration test set
unless it is sensitive to some error direction that no other paths in the test
set are sensitive to. The result then follows from the fact that there are only
m error directions to begin with.
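The argument in the proof suggests a simple greedy procedure: scan the
C'(3,1) rows path by path and keep a row only if it is linearly independent
of the rows already kept. The sketch below is one possible realization under
that reading, not a method given in the paper. It still assumes the candidate
paths and their matrices are already known, and the three-input module used
to exercise it is hypothetical.

```python
from fractions import Fraction

def reduce_against(basis, row):
    """Subtract multiples of the kept rows so their lead positions become 0."""
    row = [Fraction(x) for x in row]
    for lead, kept in basis:
        if row[lead] != 0:
            factor = row[lead] / kept[lead]
            row = [a - factor * b for a, b in zip(row, kept)]
    return row

def select_rows(paths, m):
    """Greedily pick (path, output index) pairs forming a non-singular M'.

    paths maps a path name to its C'(3,1) rows.  Returns the picks, or
    None when all rows together have rank below m.
    """
    basis, picks = [], []
    for name, rows in paths.items():
        for out_idx, row in enumerate(rows):
            reduced = reduce_against(basis, row)
            lead = next((j for j, x in enumerate(reduced) if x != 0), None)
            if lead is None:
                continue                  # covers no new error direction
            basis.append((lead, reduced))
            picks.append((name, out_idx))
            if len(picks) == m:
                return picks
    return None

# Hypothetical module with m = 3 inputs and three paths.
PATHS = {
    "A": [[1, 1, 0], [2, 2, 0]],
    "B": [[0, 0, 1], [1, 1, 1]],
    "C": [[1, 0, 0], [0, 0, 0]],
}
picks = select_rows(PATHS, 3)
print(picks)   # [('A', 0), ('B', 0), ('C', 0)]
```

Since every kept row raises the rank by one, at most m rows, and hence at
most m paths, are ever selected.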

In many cases fewer than m paths will be required since one or more
paths may contribute multiple rows to M'. If the particular module being
examined has at least as many outputs as inputs it is possible that a single
path will be sufficient to pass all possible errors in the input to errors in
the output.

Theorem 1 still leaves the following open problems:

a) How does one find the proper C'(3,1) rows in order to build the m x m
non-singular matrix? For a given program with p paths, ℓ outputs, and
m inputs there are $\binom{\ell \times p}{m}$ different candidates for M'.
Clearly some path selection method must be used to reduce this large
number of candidates.

b) The possibility exists that no m x m non-singular matrix exists (i.e.
all paths through the module are "blind" to one or more error directions).
Is there some method to discover this without looking at all candidates for
the m x m non-singular matrix? (The reader should verify that the
program of Example 1 is in fact completely blind to the error direction
discussed in Example 6.) We will call those integration time computation
errors which lie in directions to which not all paths are blind
"detectable integration time computation errors".

c) The requirement that paths be "input space spanning" needs further
examination. Equality predicates, or combinations of predicates which
imply an equality condition (such as A ≤ B and B ≤ A), along a given
path make it impossible for certain types of integration errors to be
detected since the correct and erroneous inputs can never follow the same
path. However the computations along that path might still satisfy the
conditions of Lemma 2.


For example, consider a module with inputs I and J, outputs O1 and O2,
and a program segment

    IF I = 1
    THEN
        O1 = I + J
        O2 = 2 + J
    ELSE

Examination of just the computations assigning O1 and O2 might lead us to
conclude that this "then" path is sufficient to detect any error in I and J
(since $\begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} \neq 0$), while in reality
it is only powerful enough to detect errors in the input when both I is equal
to 1 and I should be equal to 1. In effect, there are no input errors in I
which cause this path to be followed when it is correct to follow this path.
We should, therefore, restate Lemmas 1 and 2 to say that such a path is
powerful enough to detect any input error for which the path is still the
correct one to be followed. All other input errors really manifest themselves
as integration time domain errors from the point of view of this path.
Integration time domain errors are studied in the following section.
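The effect of the equality predicate can be seen by enumerating a few input
errors and keeping only those that leave the input on the "then" path. The
sketch below assumes the "then" branch computes O1 = I + J and O2 = 2 + J.
The ELSE branch is not given in the text, so the hypothetical function
`module_then_path` simply returns None for it.

```python
def module_then_path(i, j):
    """Outputs on the "then" path (taken only when I == 1), else None."""
    if i != 1:
        return None            # ELSE branch is not specified in the text
    return (i + j, 2 + j)

correct = (1, 5)               # a correct input that follows the "then" path
errors = [(ei, ej) for ei in (-1, 0, 1) for ej in (-1, 0, 1)
          if (ei, ej) != (0, 0)]

# Errors for which the erroneous input still follows the "then" path.
feasible = [e for e in errors
            if module_then_path(correct[0] + e[0], correct[1] + e[1]) is not None]
print(feasible)   # [(0, -1), (0, 1)]
```

Every feasible error has a zero I component, so the path covers only the J
direction even though its coefficient matrix is non-singular.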

We conclude this section by noting that, since there are only m
linearly independent error directions to be covered,

Theorem  2:  A  set  of  at  most  m  "beneficial"  paths  (where  a  path  is  added  to 
the  set  iff  it  covers  a  new  error  direction)  will  suffice  to  transmit  any 
detectable  integration  time  computation  error  in  an  input  to  a  correct  module  to 
some  output  of  the  module. 

If  the  paths  are  input  space  spanning,  we  need  only  compute  the  matrix 
information  discussed  previously.  If  not,  we  must  further  ensure  that  an  error 
direction  which  we  intend  a  path  to  cover  is  in  fact  feasible  along  that  path. 

6. Detection of Integration Time Domain Errors

We now address the second type of integration error presented in Section
4, that of an error in the calling program which causes the (incorrect) input to
the module to follow a different path. This situation comes about when the
error causes one of the module's predicates to have an interpretation which is
nonequivalent to that which correct input would have produced.


The model of Section 4 needs to be expanded slightly in order to
incorporate the idea of a predicate interpretation. Given a particular
predicate T in the module that is under examination, and a path leading to
that predicate, that predicate interpretation can be modeled as

$$
0 \;\; \text{.relop.} \;\; T^t C\, VO
$$

where

VO represents the initial environment vector (as before),

C represents the results of the computations in the module
along the path leading to the predicate,

T represents the predicate which when applied to C VO yields
a scalar which is compared to zero to determine whether to
branch or not, i.e., the elements of T contain the
coefficients of the constant and variables used in the
predicate (the transpose of T is required in this scalar
product since T is a column vector), and

.relop. is any relational operator which determines the type of
comparison being made.


Example 8 Using the module of Example 1 again, the predicate P1 .relop. 0 can
be represented as follows, since there is only one path leading to this
predicate:

$$
T^t C\, VO =
\begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & -1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix}
1 \\ \text{value of I1} \\ \text{value of I2} \\ \text{value of I3} \\
\text{value of I4} \\ \text{undefined} \\ \text{undefined} \\
\text{undefined} \\ \text{undefined} \\ \text{undefined}
\end{pmatrix}
$$

Thus the interpretation of the predicate is

    0 .relop. T^t C VO

or

    0 .relop. I1 + I2

which, when evaluated for (I1, I2, I3, I4) = (1, -1, 2, 3),
yields

    0 .relop. 0, or TRUE.

In order for an error in the calling program to produce a nonequivalent
predicate interpretation along some given path in the correct module, it is
necessary for the following condition to hold.

$$
\forall \lambda \neq 0: \quad e \neq 0 \;\Rightarrow\;
T^t C\,(VO + e) \neq \lambda\, T^t C\, VO \qquad (3)
$$

where e represents the error vector. That is, the erroneous interpretation of
the predicate should not be a multiple of the correct interpretation, for
otherwise the path taken by any input reaching this predicate along the given
path will not be altered.
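The predicate model can be made concrete with a short computation of the
interpretation vector T^t C for the predicate on P1 in Example 8. This is a
sketch under stated assumptions: the rows of `C` are those shown for the path
leading to the predicate (P1 = I1 + I2, P2 = 2*I1 - I3 + I4), T is taken to
select the P1 position, and the "undefined" entries of VO are filled with 0
as a placeholder, which is harmless because their columns in C are zero.

```python
N = 10   # 1 constant + 4 inputs + 5 further positions

def unit(i, n=N):
    """Row with a single 1 in position i (an "identity row")."""
    return [1 if j == i else 0 for j in range(n)]

C = [unit(0), unit(1), unit(2), unit(3), unit(4),
     [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],     # P1 = I1 + I2
     [0, 2, 0, -1, 1, 0, 0, 0, 0, 0],    # P2 = 2*I1 - I3 + I4
     [0] * N, [0] * N, [0] * N]          # not yet assigned on this path

T = unit(5)   # assumed: the predicate tests P1 only

def interpretation_vector(t, c):
    """The row vector T^t C."""
    return [sum(t[i] * c[i][j] for i in range(len(t))) for j in range(len(c[0]))]

def scalar(ttc, vo):
    """The value T^t C VO that is compared with zero."""
    return sum(a * b for a, b in zip(ttc, vo))

ttc = interpretation_vector(T, C)
vo = [1, 1, -1, 2, 3, 0, 0, 0, 0, 0]       # (I1, I2, I3, I4) = (1, -1, 2, 3)
vo_err = [1, 2, -1, 2, 3, 0, 0, 0, 0, 0]   # same input with an error of +1 in I1

print(ttc)                 # [0, 1, 1, 0, 0, 0, 0, 0, 0, 0], i.e. I1 + I2
print(scalar(ttc, vo))     # 0
print(scalar(ttc, vo_err)) # 1
```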


Example 9 Using the subroutine of Example 1 once again, we consider the
predicate P1 .relop. 0, whose interpretation is I1 + I2 .relop. 0. If an error
in the calling module causes I1 and I2 to be modified (to I1' and I2')
in such a way that I1' + I2' = λ(I1 + I2) for some λ ≠ 0, then the
interpretation of this predicate (i.e. λ(I1 + I2) .relop. 0) is equivalent to
the original. As an example of such a situation, suppose the calling
program had inputs X, Y, and Z and further suppose that computations in the
calling program should have set

    I1 = X + 2*Y

    I2 = 2*Z

but erroneously set

    I1 = 2*X + Z

    I2 = 3*Z + 4*Y

Then, in terms of the inputs to the calling module, the
interpretation of P1 .relop. 0 should have been X + 2*Y + 2*Z .relop. 0, but
instead is 2*X + 4*Y + 4*Z = 2*(X + 2*Y + 2*Z) .relop. 0. Both the correct and
incorrect interpretations evaluate identically for any triplet (X, Y, Z).
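The arithmetic of Example 9 can be verified by representing I1 and I2 as
coefficient vectors over (X, Y, Z). The helper names below are illustrative
assumptions.

```python
def interpretation(i1_coeffs, i2_coeffs):
    """Coefficients over (X, Y, Z) of I1 + I2."""
    return tuple(a + b for a, b in zip(i1_coeffs, i2_coeffs))

def is_nonzero_multiple(u, v):
    """True when v = lam * u for some lam != 0 (cross-multiplication test)."""
    if not any(u) or not any(v):
        return False
    n = len(u)
    return all(u[i] * v[j] == u[j] * v[i] for i in range(n) for j in range(n))

correct = interpretation((1, 2, 0), (0, 0, 2))     # I1 = X + 2Y,  I2 = 2Z
erroneous = interpretation((2, 0, 1), (0, 4, 3))   # I1 = 2X + Z,  I2 = 3Z + 4Y

print(correct, erroneous)                         # (1, 2, 2) (2, 4, 4)
print(is_nonzero_multiple(correct, erroneous))    # True
```

Because the erroneous interpretation is exactly twice the correct one, no
test that only observes the outcome of this predicate can reveal the error.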


As in the previous section, we must be cognizant of the effect non input
space spanning paths have on the ability of a predicate to undergo a change in
interpretation.


Example 10 The predicate P1 .relop. 0 from Example 1, whose interpretation is
I1 + I2 .relop. 0, appears to be sensitive to changes in the input I1 as long
as its new interpretation is not λ(I1 + I2) .relop. 0. However, suppose that
this predicate appeared along a path in the module previously constrained
by a predicate I1 = 1. Since P1 .relop. 0 cannot even be reached if I1 ≠ 1,
it can be said to have no interpretation under these conditions. Hence,
its interpretation cannot be affected by a change in I1, so that P1 is
no longer helpful in identifying any errors in I1.


Let us now examine the C matrix in greater detail. The C matrix can
be considered as 9 submatrices (as before). The only submatrix of interest
this time is C(2,1) (the submatrix which defines how inputs are mapped onto
program variables), as only input and program variables can be used in the
predicate. Likewise, the only interesting positions of T are the first
m + n + 1 positions (all the rest must be 0). Therefore if condition (3) is
grouped in the following way,

$$
\forall \lambda \neq 0: \quad e \neq 0 \;\Rightarrow\;
(T^t C)\,(VO + e) \neq \lambda\,(T^t C)\, VO
$$

then the expression (T^t C) is the predicate interpretation vector of which only
the first m + 1 positions can be non-zero. These positions represent the
manner in which input variables are mapped onto the predicate scalar. Hence we
can expand condition (3) as follows.


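$$
\forall \lambda \neq 0: \quad e \neq 0 \;\Rightarrow\;
(T^t C)_{1,\ldots,m+1}\,\bigl(VO_{1,\ldots,m+1} + e_{1,\ldots,m+1}\bigr)
\neq \lambda\,(T^t C)_{1,\ldots,m+1}\, VO_{1,\ldots,m+1} \qquad (4)
$$

where the subscripts indicate that just the first m + 1 positions of each
vector are being used. As a result, we can conclude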


Lemma 4: Under restrictions 1-3 of Section 4, in order to ensure that an error
in the calling program produces a nonequivalent predicate interpretation along
some given path in the correct module, it is sufficient that there exist a
predicate T in the module and an input space spanning path leading to T
satisfying

$$
\forall \lambda \neq 0: \quad e_{1,\ldots,m+1} \neq 0 \;\Rightarrow\;
(T^t C)_{1,\ldots,m+1}\,\bigl(VO_{1,\ldots,m+1} + e_{1,\ldots,m+1}\bigr)
\neq \lambda\,(T^t C)_{1,\ldots,m+1}\, VO_{1,\ldots,m+1}.
$$

We are now faced with the problem of how to detect such changes in a predicate
interpretation. Fortunately, the Domain Testing Strategy [White, 80] provides
an answer to this question. By selecting a small number of test points at or
near the border of the input space corresponding to the predicate
interpretation, we can guarantee that essentially all changes (up to a
parameter ε) in interpretation are detected. In order to use this result, it is
necessary to assume that adjacent regions of the module's input space compute
different functions which do not give "coincidentally identical" results on the
set of test data chosen. With this in mind, we have


Lemma 5: By retesting a predicate satisfying Lemma 4 using the Domain
Testing Strategy, it is possible to detect any integration
time domain error which results in a change of magnitude
greater than ε in the predicate interpretation.

As in the previous section, we note that

a) there is in general no predicate satisfying the conditions of Lemma 4,
and

b) since $e_1 = 0$, there are really only m linearly independent error
directions.


However, if we can select a set of m predicate interpretations
$\{[(T^t C)_{2,\ldots,m+1}]_i\}_{i=1}^{m}$ from input space spanning paths such
that the m x m matrix M" whose i-th row consists of $[(T^t C)_{2,\ldots,m+1}]_i$
has a non-zero determinant, then

$$
\forall \lambda \neq 0: \quad e_{2,\ldots,m+1} \neq 0 \;\Rightarrow\;
[(T^t C)_{2,\ldots,m+1}]_i\; e_{2,\ldots,m+1}
\neq \lambda\,[(T^t C)_{2,\ldots,m+1}]_i\; VO_{2,\ldots,m+1}
\quad \text{for all } i = 1, \ldots, m.
$$

We can therefore conclude

Theorem 3: By retesting, using the Domain Testing Strategy, a set of
input space spanning paths through a correct module such that the set
contains m predicate interpretations satisfying the condition on M"
defined above, it is possible to detect any integration time domain
error in the module of Domain Testing consequence > ε.
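The condition on M" is a plain determinant test on the input coefficients of
the chosen predicate interpretations. A minimal sketch follows, using Laplace
expansion, which is adequate for the small values of m considered here. The
two sets of three predicate interpretations over inputs (I1, I2, I3) are
hypothetical.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, a in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * a * det(minor)
    return total

# Rows: input coefficients of three hypothetical predicate interpretations.
usable = [[1, 1, 0],      # I1 + I2
          [0, 1, -1],     # I2 - I3
          [1, 0, 2]]      # I1 + 2*I3
unusable = [[1, 1, 0],    # I1 + I2
            [0, 1, -1],   # I2 - I3
            [1, 0, 1]]    # I1 + I3, which is row 1 minus row 2

print(det(usable))     # 1: the three predicates cover all error directions
print(det(unusable))   # 0: some error direction is missed
```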

As was the case in the previous section, we have no assurance that we
can ever find such a set of predicate interpretations, nor do we have a
computationally efficient method of finding them if they exist.

We also note, as before, that the "input space spanning" requirement
appears to impose additional requirements on the paths we can choose. However,
as before, a set of at most m "helpful" paths, each of which may itself be
insensitive to certain error directions, but which covers an error direction
that the other paths don't, will suffice to detect any "detectable" integration
time domain error.

7. The Linearity Requirements

The  results  presented  in  the  previous  sections  were  developed  under  the 
very  restrictive  assumption  that  all  computations  and  predicates  were  linear 
with  respect  to  the  module's  inputs.  However,  that  assumption  was  not  used  in 
its  entirety.  For  example,  we  can  apply  the  results  for  detecting  integration 
time  computation  errors  as  long  as  we  can  find  a  set  of  paths  which  produce 
enough  linear  functions  of  the  module's  inputs  so  that  the  nonsingular  matrix 
M'  can  be  produced.  It  is  certainly  more  realistic  to  assume  that  some 
functions  computed  along  some  paths  in  the  module  are  linear. 

For  integration  time  domain  errors,  it  is  also  not  necessary  to  have  all 
computations  and  predicates  be  linear.  Rather,  we  require  that  enough  paths 
containing  linear  predicate  interpretations  be  found  so  that  the  matrix  M"  can 
be constructed. Note that this restriction is even weaker than that of a
"linearly domained" module [Zeil, 81], where every predicate is required to
have a linear interpretation.

Final  Remarks 

We have shown that, if an error exists in a computation preceding a call
to a previously validated module, and if that error results in an error in the
module's input, it is theoretically possible to detect this condition without
having to deal with the entire path complexity of the module. In fact, the
maximum number of paths that need be retested depends only on the dimensionality
of the module's input space. Sufficient conditions on this set of paths have
also been presented.

It  remains  to  be  shown  whether  computationally  efficient  methods  exist 
for  finding  the  right  set  of  paths  to  be  tested.  From  the  development  of  the 
results,  "input  space  spanning"  paths  seem  to  offer  the  most  promise.  This 
suggests  that  choosing  false  branches  of  equality  predicates  and  strict 
inequality  branches  of  other  predicates  may  be  a  useful  heuristic  in  attempting 
to  cover  all  error  directions. 

We  have  noted  the  possibility  that  no  set  of  paths  satisfying  the 
retesting  conditions  exists.  One  may  be  tempted  to  argue  that  the  likelihood  of 
such  a  phenomenon  is  small,  since  the  set  of  functions  and  predicates 
constraining  the  error  directions  has  measure  zero  with  respect  to  the  error 
space  under  the  standard  "equal  likelihood"  assumptions.  However,  our  intuition 
tells  us  that  in  real  programs  the  equal  likelihood  assumption  does  not  apply. 

We  can  further  note  that  even  if  we  can  determine  the  existence  of  a  set 
of  paths  which,  when  retested,  will  be  sure  to  transmit  any  input  error  to  the 
output  of  the  module,  we  must  now  face  the  problem  of  generating  test  data  for 
the  calling  program  which  will  exercise  these  paths.  Our  ability  to  generate 
such  test  data  may  well  be  constrained  by  the  structure  of  the  calling  program. 
That  is,  we  may  only  be  able  to  generate  data  which  exercise  the  key  paths  in 
the  correct  module  if  we  follow  certain  paths  in  the  calling  program.  This  may 
inhibit  our  ability  to  thoroughly  test  certain  parts  of  the  calling  program. 
Experiments performed on real programs should provide useful answers as to the
severity of these problems.

We  have  approached  the  problem  of  integration  testing  in  this  paper  from 
a  "bottom  up"  point  of  view,  in  that  we  were  concerned  with  the  integration  of  a 
previously  tested  module  into  a  higher  level  software  unit.  The  resulting 
integration  testing  strategy  involves  selecting  carefully  chosen  paths  in  the 
(completely  developed)  module  for  retesting.  However,  the  results  of  this 
investigation  suggest  that  one  might  equally  well  have  explored  the  problem  from 
the  "top  down"  point  of  view.  That  is,  we  might  explore  the  idea  of  using  the 
notion  of  "linearly  independent"  "input  space  spanning"  paths  in  stub 
development,  for  it  is  at  this  point  in  top-down  development  that  we  are  really 
concerned  with  error  conditions  in  the  higher  level  software  unit. 

Finally, we note that errors in the higher level unit which result in
incorrect  module  input  values  constitute  only  one  class  of  integration  problems. 
For  example,  we  might  have  errors  (in  the  calling  module)  in  the  code  following 
the  call  to  the  correct  module.  The  ability  to  detect  these  errors,  however, 
may  depend  on  which  path  through  the  module  was  followed,  since  the  program 
statement  in  error  may  involve  output  variables  from  the  correct  module.  It 
remains  to  be  shown  whether  the  paths  to  be  selected  to  catch  input  errors  to 
the  module  are  in  any  way  related  to  the  paths  required  to  detect  these  errors. 

Despite  these  problems,  formidable  as  they  are,  it  is  comforting  to  have 
the  intuitively  appealing  result  that,  from  a  theoretical  standpoint,  it  is  only 
necessary  to  retest  a  small  number  of  possible  paths  through  a  correct  module  in 
order  to  detect  certain  integration  errors.  It  is  our  hope  that  these  results 
can  form  the  basis  of  a  more  unified  and  systematic  approach  to  integration 
testing,  so  that  some  form  of  (partial)  certification  of  a  software  system  may 
be possible.